Partitions of R^n with Maximal Seclusion and their Applications to Reproducible Computation
We introduce and investigate a natural problem regarding unit cube tilings/partitions of Euclidean space and also consider broad generalizations of this problem. The problem fits well within a historical context of similar problems and also has applications to the study of reproducibility in randomized computation.
Given $k$ and $\varepsilon$, we define a $(k,\varepsilon)$-secluded unit cube partition of $\mathbb{R}^d$ to be a unit cube partition of $\mathbb{R}^d$ such that for every point $p \in \mathbb{R}^d$, the closed $\varepsilon$-ball around $p$ intersects at most $k$ cubes. The problem is to construct such partitions for each dimension $d$ with the primary goal of minimizing $k$ and the secondary goal of maximizing $\varepsilon$.
We prove that for every dimension , there is an explicit and efficiently computable -secluded axis-aligned unit cube partition of with and . We complement this construction by proving that for axis-aligned unit cube partitions, the value of is the minimum possible, and when is minimized at , the value is the maximum possible. This demonstrates that our constructions are the best possible.
We also consider the much broader class of partitions in which every member has at most unit volume and show that is still the minimum possible. We also show that for any reasonable (i.e. ), it must be that . This demonstrates that when is minimized at , our unit cube constructions are optimal to within a logarithmic factor even for this broad class of partitions. In fact, they are even optimal in up to a logarithmic factor when is allowed to be polynomial in .
We extend the techniques used above to introduce and prove a variant of the KKM lemma, the Lebesgue covering theorem, and Sperner's lemma on the cube which says that for every , and every proper coloring of , there is a translate of the -ball which contains points of at least different colors.
Advisers: N. V. Vinodchandran & Jamie Radcliff
Comparing Powers of Edge Ideals
Given a nontrivial homogeneous ideal $I$, a
problem of great recent interest has been the comparison of the $r$th ordinary
power of $I$ and the $m$th symbolic power $I^{(m)}$.
This comparison has been undertaken directly via an exploration of which
exponents $m$ and $r$ guarantee the subset containment $I^{(m)} \subseteq I^r$
and asymptotically via a computation of the resurgence $\rho(I)$, a number for
which any $m/r > \rho(I)$ guarantees $I^{(m)} \subseteq I^r$.
Recently, a third quantity, the symbolic defect, was introduced; as
$I^t \subseteq I^{(t)}$, the symbolic defect is the minimal number of generators
required to add to $I^t$ in order to get $I^{(t)}$.
We consider these various means of comparison when $I$ is the edge ideal of
certain graphs by describing an ideal $J$ for which $I^{(t)} = I^t + J$.
When $I$ is the edge ideal of an odd cycle, our description of the structure
of yields solutions to both the direct and asymptotic containment
questions, as well as a partial computation of the sequence of symbolic
defects.
Comment: Version 2: Revised based on referee suggestions. Lemma 5.12 was added
to clarify the proof of Theorem 5.13. To appear in the Journal of Algebra and
its Applications. Version 1: 20 pages. This project was supported by Dordt
College's undergraduate research program in summer 201
List and Certificate Complexities in Replicable Learning
We investigate replicable learning algorithms. Ideally, we would like to
design algorithms that output the same canonical model over multiple runs, even
when different runs observe a different set of samples from the unknown data
distribution. In general, such a strong notion of replicability is not
achievable. Thus we consider two feasible notions of replicability called list
replicability and certificate replicability. Intuitively, these notions capture
the degree of (non-)replicability. We design algorithms for certain learning
problems that are optimal in list and certificate complexity. We establish
matching impossibility results.
Geometry of Rounding: Near Optimal Bounds and a New Neighborhood Sperner's Lemma
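A partition $\mathcal{P}$ of $\mathbb{R}^d$ is called a
$(k,\varepsilon)$-secluded partition if, for every $p \in \mathbb{R}^d$,
the ball of radius $\varepsilon$ centered at $p$ intersects at most $k$
members of $\mathcal{P}$. A goal in designing such secluded partitions is to
minimize $k$ while making $\varepsilon$ as large as possible. This partition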
problem has connections to a diverse range of topics, including deterministic
rounding schemes, pseudodeterminism, replicability, as well as Sperner/KKM-type
results.
In this work, we establish near-optimal relationships between $k$ and
$\varepsilon$. We show that, for any bounded measure partitions and for any
, it must be that . Thus, when is
restricted to , it follows that . This bound is tight up to log factors, as it is
known that there exist secluded partitions with and
. We also provide new constructions of secluded
partitions that work for a broad spectrum of $k$ and $\varepsilon$
parameters. Specifically, we prove that, for any
, there is a secluded partition with
and
. These new partitions are optimal up to
factors for various choices of $k$ and $\varepsilon$. Based
on the lower bound result, we establish a new neighborhood version of Sperner's
lemma over hypercubes, which is of independent interest. In addition, we prove
a no-free-lunch theorem about the limitations of rounding schemes in the
context of pseudodeterministic/replicable algorithms.
Neighborhood Variants of the KKM Lemma, Lebesgue Covering Theorem, and Sperner's Lemma on the Cube
We establish a "neighborhood" variant of the cubical KKM lemma and the
Lebesgue covering theorem and deduce a discretized version which is a
"neighborhood" variant of Sperner's lemma on the cube. The main result is the
following: for any coloring of the unit -cube in which points on
opposite faces must be given different colors, and for any ,
there is an -ball which contains points of at least
different colors, (so in particular,
at least different colors for all sensible
).
Comment: 18 pages plus appendices (30 pages total), 3 figures
Geometry of Rounding
Rounding has proven to be a fundamental tool in theoretical computer science.
By observing that rounding and partitioning of $\mathbb{R}^d$ are equivalent,
we introduce the following natural partition problem which we call the
secluded hypercube partition problem: Given $k$ (ideally small)
and $\varepsilon$ (ideally large), is there a partition of $\mathbb{R}^d$ with
unit hypercubes such that for every point $p \in \mathbb{R}^d$, its closed
$\varepsilon$-neighborhood (in the $\ell_\infty$ norm) intersects at most $k$
hypercubes?
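To make the question concrete, the following minimal Python sketch counts how many cells of the standard integer grid (cubes of the form [n_1, n_1+1) x ... x [n_d, n_d+1)) a closed neighborhood of a point intersects. Two assumptions are made here for illustration only: the neighborhood is an axis-aligned box of half-width eps (an l-infinity ball), and the partition is the plain integer grid, which is not the construction from the paper. The function name cubes_hit_by_ball is hypothetical. The sketch suggests why the naive grid is a poor answer: near a grid corner, any eps > 0 already touches 2^d cubes.

```python
import math


def cubes_hit_by_ball(point, eps):
    """Count integer-grid unit cubes [n, n+1)^d that meet the closed
    l-infinity ball of radius eps around `point`.

    Illustrative only: the integer grid is an assumed example partition.
    """
    count = 1
    for x in point:
        # Along one axis, the closed interval [x - eps, x + eps] meets the
        # half-open cells [n, n+1) for n = floor(x - eps), ..., floor(x + eps).
        lowest_cell = math.floor(x - eps)
        highest_cell = math.floor(x + eps)
        count *= highest_cell - lowest_cell + 1
    return count


# At a grid corner in dimension 3, every small ball meets 2^3 = 8 cubes.
print(cubes_hit_by_ball((0.0, 0.0, 0.0), 0.1))  # → 8

# At the center of a cube, a small ball stays inside a single cube.
print(cubes_hit_by_ball((0.5, 0.5), 0.1))  # → 1
```

In the language of the problem, the integer grid therefore cannot achieve any k smaller than 2^d, however small eps is chosen.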
We undertake a comprehensive study of this partition problem. We prove that
for every , there is an explicit (and efficiently computable)
hypercube partition of with and . We complement this construction by proving that the value of
is the best possible (for any ) for a broad class of
"reasonable" partitions including hypercube partitions. We also investigate
the optimality of the parameter $\varepsilon$ and prove that any partition in this
broad class that has , must have .
These bounds imply limitations of certain deterministic rounding schemes
existing in the literature. Furthermore, this general bound is based on the
currently known lower bounds for the dissection number of the cube, and
improvements to this bound will yield improvements to our bounds.
While our work is motivated by the desire to understand rounding algorithms,
one of our main conceptual contributions is the introduction of the
secluded hypercube partition problem, which fits well with a long history of
investigations by mathematicians on various hypercube partitions/tilings of
Euclidean space.
Epigenome Wide Association Study of SNP–CpG Interactions on Changes in Triglyceride Levels after Pharmaceutical Intervention: A GAW20 Analysis
In the search for an understanding of how genetic variation contributes to the heritability of common human disease, the potential role of epigenetic factors, such as methylation, is being explored with increasing frequency. Although standard analyses test for associations between methylation levels at individual cytosine-phosphate-guanine (CpG) sites and phenotypes of interest, some investigators have begun testing for methylation and how methylation may modulate the effects of genetic polymorphisms on phenotypes. In our analysis, we used both a genome-wide and candidate gene approach to investigate potential single-nucleotide polymorphism (SNP)–CpG interactions on changes in triglyceride levels. Although we were able to identify numerous loci of interest when using an exploratory significance threshold, we did not identify any significant interactions using a strict genome-wide significance threshold. We were also able to identify numerous loci using the candidate gene approach, in which we focused on 18 genes with prior evidence of association with triglyceride levels. In particular, we identified GALNT2 loci as containing potential CpG sites that moderate the impact of genetic polymorphisms on triglyceride levels. Further work is needed to provide clear guidance on analytic strategies for testing SNP–CpG interactions, although leveraging prior biological understanding may be needed to improve statistical power in data sets with smaller sample sizes.
Evaluating the Performance of Gene-Based Tests of Genetic Association when Testing for Association Between Methylation and Change in Triglyceride Levels at GAW20
Although methylation data continues to rise in popularity, much is still unknown about how to best analyze methylation data in genome-wide analysis contexts. Given continuing interest in gene-based tests for next-generation sequencing data, we evaluated the performance of novel gene-based test statistics on simulated data from GAW20. Our analysis suggests that most of the gene-based tests are detecting real signals and maintaining the Type I error rate. The minimum p-value and threshold-based tests performed well compared to single-marker tests in many cases, especially when the number of variants was relatively large with few true causal variants in the set.
A Genome-Wide Association Study of Red-Blood Cell Fatty Acids and Ratios Incorporating Dietary Covariates: Framingham Heart Study Offspring Cohort
Recent analyses have suggested a strong heritable component to circulating fatty acid (FA) levels; however, only a limited number of genes have been identified which associate with FA levels. In order to expand upon a previous genome-wide association study done on participants in the Framingham Heart Study Offspring Cohort and FA levels, we used data from 2,400 of these individuals for whom red blood cell FA profiles, dietary information and genotypes are available, and then conducted a genome-wide evaluation of potential genetic variants associated with 22 FAs and 15 FA ratios, after adjusting for relevant dietary covariates. Our analysis found nine previously identified loci associated with FA levels (FADS, ELOVL2, PCOLCE2, LPCAT3, AGPAT4, NTAN1/PDXDC1, PKD2L1, HBS1L/MYB and RAB3GAP1/MCM6), while identifying four novel loci. The latter include an association between variants in CALN1 (Chromosome 7) and eicosapentaenoic acid (EPA), DHRS4L2 (Chromosome 14) and a FA ratio measuring delta-9-desaturase activity, as well as two loci associated with less well understood proteins. Thus, the inclusion of dietary covariates had a modest impact, helping to uncover four additional loci. While genome-wide association studies continue to uncover additional genes associated with circulating FA levels, much of the heritable risk is yet to be explained, suggesting the potential role of rare genetic variation, epistasis and gene-environment interactions on FA levels as well. Further studies are needed to continue to understand the complex genetic picture of FA metabolism and synthesis.